MultiWOZ-DF -- A Dataflow implementation of the MultiWOZ dataset

Joram Meron, Victor Guimarães

Category: Natural Language Processing

2022-11-04

Semantic Machines (SM) introduced the dataflow (DF) paradigm to dialogue modelling, using computational graphs to hierarchically represent user requests, data, and the dialogue history [Semantic Machines et al. 2020]. Although the main focus of that paper was the SMCalFlow dataset (to date, the only dataset with "native" DF annotations), the authors also reported some results of an experiment on the commonly used MultiWOZ dataset [Budzianowski et al. 2018], transformed into a DF format. In this paper, we expand the experiments using DF for the MultiWOZ dataset, exploring some additional experimental set-ups. The code and instructions to reproduce the experiments reported here have been released. The contributions of this paper are: 1) a DF implementation capable of executing MultiWOZ dialogues; 2) several versions of the conversion of MultiWOZ into a DF format; 3) experimental results on state match and translation accuracy.
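To make the idea of a request represented as a computational graph more concrete, here is a minimal sketch. It is not the authors' implementation: the `Node` class, the `to_state` flattening, and the two metric functions are hypothetical names chosen for illustration. It also assumes that "state match" means the slot-value state recovered from a predicted graph equals the gold dialogue state, and that "translation accuracy" means a token-level exact match between predicted and gold program strings.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a toy dataflow graph (hypothetical, for illustration)."""
    op: str
    # Maps an input name to either a child Node or a plain string value.
    inputs: dict = field(default_factory=dict)


def to_state(node, prefix=""):
    """Flatten a graph into MultiWOZ-style 'domain-slot' -> value pairs."""
    state = {}
    for name, child in node.inputs.items():
        key = f"{prefix}-{name}" if prefix else name
        if isinstance(child, Node):
            state.update(to_state(child, key))
        else:
            state[key] = child
    return state


def state_match(pred_graph, gold_state):
    """Assumed definition: the flattened predicted state equals the gold state."""
    return to_state(pred_graph) == gold_state


def exact_match(pred_programs, gold_programs):
    """Assumed definition: fraction of programs identical up to whitespace."""
    pairs = list(zip(pred_programs, gold_programs))
    if not pairs:
        return 0.0
    return sum(p.split() == g.split() for p, g in pairs) / len(pairs)


# "I want a 4-star hotel in the north" as a small hierarchical graph.
request = Node("find", {"hotel": Node("Hotel", {"area": "north", "stars": "4"})})
gold = {"hotel-area": "north", "hotel-stars": "4"}
print(to_state(request), state_match(request, gold))
```

The nesting mirrors the hierarchical representation described in the abstract: the constraint node is an input of the request node, and the flat MultiWOZ-style state is derived from the graph rather than stored directly.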

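In [Andreas et al. 2020], a dataflow (DF) based dialogue system was introduced, showing clear advantages over many commonly used current systems. This was accompanied by the release of SMCalFlow, a practically relevant, manually annotated dataset that is more detailed and much larger than any comparable dialogue dataset. Despite these remarkable contributions, the community has not shown further interest in this direction. What are the reasons for this lack of interest, and how can the community be encouraged to pursue research in this direction? One explanation may be the perception that the approach is too complex, in both the annotation and the system. This paper argues that this perception is wrong: 1) suggestions for a simplified format for the annotation of the dataset are presented; 2) an implementation of the DF execution engine is released (https://github.com/telepepathylabsai/opendf), which can serve as a sandbox allowing researchers to easily implement and experiment with new DF dialogue designs. The hope is that these contributions will help more practitioners explore new ideas and designs for DF-based dialogue systems.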

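SMCalFlow is a large corpus of semantically detailed annotations of task-oriented natural dialogues. The annotations use a dataflow approach, in which each annotation is a program that represents the user's request. Despite the availability, size, and richness of this annotated corpus, it has seen only very limited use in dialogue systems research, at least in part because the annotations are difficult to understand and use. To address these difficulties, this paper suggests a simplification of the SMCalFlow annotations and releases the code needed to inspect the execution of the annotated dataflow programs, which should give dialogue systems researchers an easy entry point for experimenting with various dataflow-based implementations and annotations.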